fix(manifest/bazel): harden Maven extraction completeness and show_extension handling#1364
Merged
Merged
Conversation
The mod show_extension argv builders for the Maven and pip extensions did not pass --lockfile_mode=off, so a read-only dependency scan could rewrite the user's MODULE.bazel.lock. Mirror the query/cquery builders and pin the lockfile read-only before user flags.
…mits The cquery timeout detection tested for a `timedOut` flag the registry spawn never sets; on a timeout the child is killed and Node reports `killed: true` with `signal: SIGTERM` (or SIGKILL). Drop the phantom `timedOut` branch and update the test to assert the real kill shape.
A Bazel Maven extraction that under-reports dependencies must never be
presented downstream as a complete SBOM. This reworks the shared status
model so the CLI is honest about partial and unanalyzable runs:
- Add an `indeterminate` probe state. An unrecognized non-zero probe exit
or a thrown probe is no longer swallowed as `not-defined` ("no Maven
here"); it is propagated so the run is never reported complete.
- A workspace that fails to load is recorded as a load failure rather than
silently skipped: a hard failure when nothing else was analyzable, a
partial otherwise.
- Enrich the extraction result with a machine-readable completeness signal
(a `complete` flag plus per-workspace / per-hub outcomes) and write a
completeness summary alongside the manifests for downstream consumers.
The partial SBOM is still emitted and the warning stays loud.
- Make both the explicit command and auto-manifest honest: exit non-zero
only on hard failure, exit 0 on partial with a prominent warning and the
completeness signal echoed.
- Gate synthetic hub manifests on committed lockfiles: when a committed
maven_install.json / <hub>_maven_install.json already covers a hub, skip
re-emitting it since the server already ingests committed lockfiles.
- Wire a --per-repo-timeout flag (and socket.json setting) with a 120s
default for the explicit command, longer than the 60s auto-manifest
default; drop the misleading comment claiming the default already existed.
Per-state diagnostics are logged under --verbose.
…lookup Array.prototype.at requires the es2022 lib; the stricter typechecker flagged it on the mock.calls tuple type. Use length-1 index access, which is lib-agnostic.
…ce root The committed-lockfile gate walked the whole tree under a workspace root and matched a maven_install.json at any depth. A nested workspace or test fixture lockfile then wrongly marked the root @maven hub as already covered, so its synthetic manifest was skipped and its distinct coordinates were never emitted while the run could still report complete. Scope the gate to depth-0: a committed lockfile only covers the workspace it lives directly in, since each workspace is analyzed independently and the server walker ingests every committed lockfile against its own workspace. Also exclude the CLI's own synthetic output directories (.socket-auto-manifest, bazel-manifests) by name so a stale prior-run manifest can never be misread as a committed lockfile. Mark Maven hub enumeration indeterminate when `bazel mod show_extension` fails to execute (non-zero exit), distinct from a clean run that finds no maven extension. A failed enumeration may have missed custom-named hubs, so the run is reported known-incomplete (partial when other hubs succeeded, hard failure when nothing is analyzable) instead of silently falling back to conventional-name probes and reporting complete.
…f blanket-flagging bazel mod show_extension for the maven extension exits non-zero on every bzlmod repo that doesn't depend on rules_jvm_external, because its argument resolution throws before any Starlark runs. Treating any non-zero exit as an indeterminate enumeration wrongly flipped those legitimate no-Maven repos to a hard failure, aborting the user's whole auto-manifest scan. Classify the result by stderr shape via classifyShowExtensionResult: - extension/module not in the dependency graph (arg-resolution error) is a legitimate not-defined, contributing no Maven (noEcosystem) - a genuine MODULE.bazel evaluation/load failure, or a missing binary (normalized code -1), stays indeterminate so the run is never complete - an unrecognized non-zero exit biases to not-defined; only positive eval-failure stderr escalates to indeterminate Replace the old hard-fail test (whose mock was a legitimate no-Maven repo) with a noEcosystem assertion, and add coverage for the not-in-graph and genuine-eval-failure paths.
The not-in-graph classifier relied on the conservative default for Bazel's
actual phrasing ('No module with the apparent repo name X exists in the
dependency graph'). Add that verified wording explicitly and pin both the
real not-in-graph (exit 2 -> not-defined) and real unbound-name eval failure
(exit 2 -> indeterminate, eval-failure checked first) with tests.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.
Bugbot Autofix is ON. A cloud agent has been kicked off to fix the reported issue.
Comment @cursor review or bugbot run to trigger another review on this PR
Reviewed by Cursor Bugbot for commit 632d5e9. Configure here.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

What
Test driven hardening of the maven bazel extractor, bundled together in one PR. I found these when expanding the corpus with more tests.
Fixes
One bad one:
--lockfile_mode=offonshow_extensionso a read-only scan nevermutates the user's
MODULE.bazel.lock.Mostly cosmetic:
hubs succeed, others fail) the SBOM is still emitted, but the run now carries
a machine-readable completeness signal and warns loudly naming the incomplete
hubs, on both the
--auto-manifestand explicitsocket manifest bazelpaths. A run is only reported complete when it is.
that skips re-emitting a hub already covered by a committed
maven_install.jsonnow checks only the hub's own workspace root, not thewhole subtree. A nested or fixture lockfile can no longer mask an uncovered
root hub and drop its coordinates.
bazel mod show_extensionnon-zero exits are classified, not blanket-handled.A repo that simply doesn't use
rules_jvm_externalexits non-zero(
No module ... exists in the dependency graph) — that's an authoritative"no Maven here" and must not fail the scan. A genuine
MODULE.bazelevaluation failure (unbound name, syntax error) is indeterminate and must not
be reported as complete. The classifier distinguishes them; the
not-in-graph and eval-failure stderr shapes were verified against real
bazel mod show_extensionoutput.emits (was checking a shape that never occurs).
Testing
pnpm check(lint + type-check): clean (one pre-existing, unrelated lintwarning on
v1.xleft untouched).pnpm test:unitfor the Bazel + auto-manifest suites: all green (287 passed)with dist built.
show_extensionclassification was validated against realbazeloutput on JVM and non-JVM bzlmod repos and a forced
MODULE.bazelevaluation failure.
Note
Medium Risk
Changes dependency-discovery and completeness semantics for security scanning; misclassification could still under-report or over-fail scans, but behavior is heavily unit-tested and biases toward not aborting no-Maven repos.
Overview
Hardens Bazel Maven manifest extraction so incomplete or ambiguous scans are not presented as complete SBOMs.
Completeness signaling: Extraction returns
complete, per-workspace/per-hubworkspaceOutcomes, and writessocket-bazel-manifest-summary.json. Runs arecompleteonly when every hub succeeded, no workspace failed to load, and no probe orshow_extensionenumeration was indeterminate. Partial runs still emit manifests but get prominent CLI/auto-manifest warnings pluscomplete=falselogging.Discovery correctness: Adds
classifyShowExtensionResultto separate “norules_jvm_external/ no Maven” (authoritative not-defined, including common non-zero exits) from realMODULE.bazeleval failures (indeterminate). Hub probes and thrown probes now map unrecognized failures to indeterminate instead of silent not-defined.Committed lockfile gate: Skips synthetic manifests only when the hub’s lockfile (
maven_install.jsonor<hub>_maven_install.json) sits at depth-0 in that workspace; ignores CLI output dirs and nested fixture lockfiles so uncovered root hubs are still extracted.CLI / tooling:
socket manifest bazelgains--per-repo-timeout(default 120s,socket.jsonsupport). cquery timeouts detectkilled+SIGTERM/SIGKILL(not a nonexistenttimedOutflag).bazel mod show_extensionfor Maven/PyPI adds--lockfile_mode=off.Reviewed by Cursor Bugbot for commit 632d5e9. Configure here.